80 research outputs found
DCTM: Discrete-Continuous Transformation Matching for Semantic Flow
Techniques for dense semantic correspondence have provided limited ability to
deal with the geometric variations that commonly exist between semantically
similar images. While variations due to scale and rotation have been examined,
there lack practical solutions for more complex deformations such as affine
transformations because of the tremendous size of the associated solution
space. To address this problem, we present a discrete-continuous transformation
matching (DCTM) framework where dense affine transformation fields are inferred
through a discrete label optimization in which the labels are iteratively
updated via continuous regularization. In this way, our approach draws
solutions from the continuous space of affine transformations in a manner that
can be computed efficiently through constant-time edge-aware filtering and a
proposed affine-varying CNN-based descriptor. Experimental results show that
this model outperforms the state-of-the-art methods for dense semantic
correspondence on various benchmarks
Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D Generation
The view inconsistency problem in score-distilling text-to-3D generation,
also known as the Janus problem, arises from the intrinsic bias of 2D diffusion
models, which leads to the unrealistic generation of 3D objects. In this work,
we explore score-distilling text-to-3D generation and identify the main causes
of the Janus problem. Based on these findings, we propose two approaches to
debias the score-distillation frameworks for robust text-to-3D generation. Our
first approach, called score debiasing, involves gradually increasing the
truncation value for the score estimated by 2D diffusion models throughout the
optimization process. Our second approach, called prompt debiasing, identifies
conflicting words between user prompts and view prompts utilizing a language
model and adjusts the discrepancy between view prompts and object-space camera
poses. Our experimental results show that our methods improve realism by
significantly reducing artifacts and achieve a good trade-off between
faithfulness to the 2D diffusion models and 3D consistency with little
overhead
Memory-guided Image De-raining Using Time-Lapse Data
This paper addresses the problem of single image de-raining, that is, the
task of recovering clean and rain-free background scenes from a single image
obscured by a rainy artifact. Although recent advances adopt real-world
time-lapse data to overcome the need for paired rain-clean images, they are
limited to fully exploit the time-lapse data. The main cause is that, in terms
of network architectures, they could not capture long-term rain streak
information in the time-lapse data during training owing to the lack of memory
components. To address this problem, we propose a novel network architecture
based on a memory network that explicitly helps to capture long-term rain
streak information in the time-lapse data. Our network comprises the
encoder-decoder networks and a memory network. The features extracted from the
encoder are read and updated in the memory network that contains several memory
items to store rain streak-aware feature representations. With the read/update
operation, the memory network retrieves relevant memory items in terms of the
queries, enabling the memory items to represent the various rain streaks
included in the time-lapse data. To boost the discriminative power of memory
features, we also present a novel background selective whitening (BSW) loss for
capturing only rain streak information in the memory network by erasing the
background information. Experimental results on standard benchmarks demonstrate
the effectiveness and superiority of our approach
- …